[Feat] Add LiteLLMEmbeddings - Support SemanticChunking through LiteLLM #154

Open · Dhan996 wants to merge 3 commits into base: development

Conversation

@Dhan996 commented Jan 25, 2025

- Added LiteLLM support to identify and embed with any Hugging Face embedding model that supports feature extraction.
- Registered the same in the registry so that AutoEmbeddings can resolve it.
- Added tests for the same.

Key things to notice:

- The timeout depends on the model size, since the model first needs to be loaded onto the local hardware.
- Context length, dimensions, and similar measures depend on the model.
- token_counter is a callable from litellm, which also needs time to load.
- Currently only Hugging Face models are supported. LiteLLM can support more providers such as Voyage, Mistral, etc., but their API keys should be given as parameters.
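
For context, a minimal usage sketch of what this PR aims to enable; the model identifier, placeholder key, and SemanticChunker wiring below are illustrative assumptions, not the final API.

from chonkie import SemanticChunker
from chonkie.embeddings import AutoEmbeddings

# Per this PR, AutoEmbeddings would route the "huggingface/" prefix to LiteLLMEmbeddings.
embeddings = AutoEmbeddings.get_embeddings(
    "huggingface/microsoft/codebert-base",
    api_key="hf_xxx",  # provider key, passed through to LiteLLM
)

# SemanticChunking through LiteLLM: hand the embeddings to a SemanticChunker.
chunker = SemanticChunker(embedding_model=embeddings)
chunks = chunker.chunk("Chonkie chunks text so that you don't have to.")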

bhavnicksm and others added 2 commits January 29, 2025 20:13
…run timings (chonkie-ai#156)

* [DOCS] Benchmarking update (chonkie-ai#145)

* Add wiki 500k benchmark results

* Update benchmarks

* chonkie is really fast, bro

* blah blah

---------

Co-authored-by: Bhavnick Minhas <[email protected]>

* Update benchmarks with corrected memory usage and size metrics

---------

Co-authored-by: Shreyash Nigam <[email protected]>
@bhavnicksm bhavnicksm changed the base branch from main to development January 29, 2025 14:54
@bhavnicksm bhavnicksm changed the title LiteLLM Integration [Feat] Add LiteLLMEmbeddings - Support SemanticChunking through LiteLLM Jan 29, 2025
EmbeddingsRegistry.register(
    "litellm",
    LiteLLMEmbeddings,
    pattern=r"^litellm/|^huggingface/"
Collaborator

Hey @Dhan996!

I have a doubt about the pattern for litellm; does litellm load litellm/<model-name> properly? Does it host its own models?

Author

LiteLLM doesn't have any models of its own. I put the litellm/ prefix in as an initial placeholder for users who want to initialize the LiteLLM handling client, but they shouldn't use it; they should specify an actual model. I will take this out, but I would appreciate other ideas. Currently, I'm thinking any Hugging Face model should be routed to LiteLLM.
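
A tiny sketch of the routing being discussed; the prefixes are only the ones mentioned in this thread, and the final pattern is still open.

import re

# Hypothetical routing: drop the "litellm/" placeholder and send only
# "huggingface/..." identifiers to LiteLLMEmbeddings; everything else
# falls through to other providers in the registry.
pattern = re.compile(r"^huggingface/")

for model in ("huggingface/microsoft/codebert-base", "litellm/some-model", "text-embedding-3-small"):
    target = "LiteLLMEmbeddings" if pattern.match(model) else "another provider"
    print(f"{model} -> {target}")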

def test_auto_embeddings_litellm(litellm_identifier):
    """Test that the AutoEmbeddings class can get LiteLLM embeddings."""
    embeddings = AutoEmbeddings.get_embeddings(
        litellm_identifier, api_key="your_litellm_api_key"
Collaborator

Hey @Dhan996!

Could you have this use OpenAI API Keys and an OpenAI model?

Author

Yup, will amend in the next commit.
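
Presumably something along these lines, where the "openai/" prefix (LiteLLM's convention for OpenAI models) and the assertion are illustrative assumptions rather than the final test.

import os

def test_auto_embeddings_litellm():
    """Check that AutoEmbeddings can resolve an OpenAI embedding model through LiteLLM."""
    embeddings = AutoEmbeddings.get_embeddings(
        "openai/text-embedding-3-small", api_key=os.environ.get("OPENAI_API_KEY")
    )
    assert embeddings is not None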

@pytest.fixture
def embedding_model():
    api_key = os.environ.get("HUGGINGFACE_API_KEY")
    return LiteLLMEmbeddings(api_key=api_key)
Collaborator

Hey @Dhan996!

Same as before, could you change the tests to use OpenAI as the default, with additional tests for Hugging Face or any other provider to check that those work, conditioned on whether their keys are in the environment?

Thanks!
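
A sketch of how that conditioning could look with pytest; the fixture and test names are assumptions, and it assumes LiteLLMEmbeddings exposes an embed(text) method like chonkie's other embeddings.

import os
import pytest

@pytest.fixture
def embedding_model():
    # OpenAI as the default provider for the suite.
    return LiteLLMEmbeddings(
        model="openai/text-embedding-3-small",
        api_key=os.environ.get("OPENAI_API_KEY"),
    )

@pytest.mark.skipif(
    "HUGGINGFACE_API_KEY" not in os.environ,
    reason="HUGGINGFACE_API_KEY not set",
)
def test_huggingface_embeddings():
    # Only runs when the Hugging Face key is present in the environment.
    model = LiteLLMEmbeddings(
        model="huggingface/microsoft/codebert-base",
        api_key=os.environ["HUGGINGFACE_API_KEY"],
    )
    assert model.embed("Chonkie chunks text.") is not None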

def __init__(
    self,
    model: str = 'huggingface/microsoft/codebert-base',
    input: List[str] = "Hello, my dog is cute",
Collaborator

Hey @Dhan996!

I believe this input is just for checking that the embedding response comes through, right? We don't have to offer the user the option to change it as part of the signature; we can keep it fixed inside __init__. It would be a good idea to offer as minimal an interface to the user as possible.

Thanks!

Author

Ah, you're right. Will amend.
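
So the amended signature could look roughly like this partial sketch of the suggestion above; apart from the model default and the probe text, the names (including the chonkie BaseEmbeddings base class import) are assumptions.

from typing import Optional

from chonkie.embeddings import BaseEmbeddings

class LiteLLMEmbeddings(BaseEmbeddings):
    # Fixed probe text, kept out of the public signature; used only to check
    # that embeddings come back and to infer the embedding dimension.
    _PROBE_TEXT = "Hello, my dog is cute"

    def __init__(
        self,
        model: str = "huggingface/microsoft/codebert-base",
        api_key: Optional[str] = None,
    ) -> None:
        super().__init__()
        self.model = model
        self.api_key = api_key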

text = [text]
retries = 5 # Number of retries
wait_time = 10 # Wait time between retries
for i in range(retries):
Collaborator

Hey @Dhan996!

Just a doubt, but does LiteLLM do any retries internally?

If it handles them, then we can push any API retries to their end; otherwise we should offer retries as a parameter during init.

Thanks!

Author

IIRC, when testing it didn't seem to handle retries, but I will play with it again, and if it does, I'll update.
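
If LiteLLM turns out not to retry internally, the "retries as an init parameter" option could be a small wrapper like this; the helper name and the broad exception handling are assumptions, not part of the PR.

import time

def _with_retries(call, retries: int = 5, wait_time: float = 10.0):
    """Invoke call() up to `retries` times, sleeping `wait_time` seconds between attempts."""
    last_exc = None
    for attempt in range(retries):
        try:
            return call()
        except Exception as exc:  # in practice, narrow this to LiteLLM's API exceptions
            last_exc = exc
            if attempt < retries - 1:
                time.sleep(wait_time)
    raise last_exc

# Example use inside embed():
#   _with_retries(lambda: litellm.embedding(model=self.model, input=text), retries=self.retries)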

@bhavnicksm
Collaborator

Hey @Dhan996!

Sorry for the delay in posting the review~ I had this in my draft for a while but I forgot to submit it 😅

Just a few tiny doubts and changes but otherwise looks good~

Thanks! 😊

Development

Successfully merging this pull request may close these issues.

[FEAT] Add an ability to use OpenAI / VoyageAI / Cohere embeddings with SDPMChunker via LiteLLM
2 participants